Yuheng Zhang

Offline Two-Player Zero-Sum Markov Games with KL Regularization

May 13, 2026
Via arXiv

Instructing LLMs to Negotiate using Reinforcement Learning with Verifiable Rewards

Apr 10, 2026
Via arXiv

Beyond Pessimism: Offline Learning in KL-regularized Games

Apr 08, 2026
Via arXiv

Beyond Semantic Manipulation: Token-Space Attacks on Reward Models

Apr 03, 2026
Via arXiv

ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction

Apr 01, 2026
Via arXiv

O3N: Omnidirectional Open-Vocabulary Occupancy Prediction

Mar 12, 2026
Via arXiv

PanoAffordanceNet: Towards Holistic Affordance Grounding in 360° Indoor Environments

Mar 10, 2026
Via arXiv

Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies

Mar 03, 2026
Via arXiv

Interaction-Grounded Learning for Contextual Markov Decision Processes with Personalized Feedback

Feb 09, 2026
Via arXiv

TagSpeech: End-to-End Multi-Speaker ASR and Diarization with Fine-Grained Temporal Grounding

Jan 11, 2026
Via arXiv